Asymptotic distributions of coalescence times and ancestral lineage numbers for populations with temporally varying size.
Abstract
The distributions of coalescence times and ancestral lineage numbers play an essential role in coalescent modeling and ancestral inference. The exact distributions of both quantities are expressed as sums of alternating series, whose terms become numerically intractable for large samples. More computationally attractive are their asymptotic distributions, which were derived in Griffiths (1984) for populations with constant size. In this article, we derive the asymptotic distributions of coalescence times and ancestral lineage numbers for populations with temporally varying size. For a sample of size $n$, denote by $T_m$ the $m$th coalescence time, when $m + 1$ lineages coalesce into $m$ lineages, and by $A_n(t)$ the number of ancestral lineages at time $t$ back from the current generation. Similar to the results in Griffiths (1984), the number of ancestral lineages, $A_n(t)$, and the coalescence times, $T_m$, are asymptotically normal, with the mean and variance of these distributions depending on the population size function, $N(t)$. At the very early stage of the coalescent, when $t \to 0$, the number of coalesced lineages $n - A_n(t)$ follows a Poisson distribution, and as $m \to n$, $n(n-1)T_m/2N(0)$ follows a gamma distribution. We demonstrate the accuracy of the asymptotic approximations by comparing them to both exact distributions and coalescent simulations. Several applications of the theoretical results are also shown: deriving statistics related to the properties of gene genealogies, such as the time to the most recent common ancestor (TMRCA) and the total branch length (TBL) of the genealogy, and deriving the allele frequency spectrum for large genealogies. With the advent of genomic-level sequencing data for large samples, the asymptotic distributions are expected to have wide applications in theoretical and methodological development for population genetic inference.
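The gamma limit stated in the abstract can be checked numerically in the simplest setting. The following is a minimal sketch, not the authors' method, restricted to the constant-size special case N(t) = N(0). It assumes that time is measured in generations and that k lineages coalesce at rate k(k-1)/(2N(0)); this rate convention is an assumption chosen to match the scaling in the abstract, and the function names are hypothetical. Under these assumptions, for m close to n the scaled time n(n-1)T_m/(2N(0)) is a sum of n - m nearly identical exponential waiting times, so its mean should be close to n - m, the mean of a gamma distribution with shape n - m and unit scale.

```python
import random


def simulate_coalescence_time(n, m, pop_size, rng):
    """Time (in generations) for n lineages to coalesce down to m lineages.

    Assumes a constant population size and a coalescence rate of
    k * (k - 1) / (2 * pop_size) while k lineages remain (an assumed
    convention, not taken from the paper).
    """
    t = 0.0
    for k in range(n, m, -1):  # k = n, n - 1, ..., m + 1
        rate = k * (k - 1) / (2.0 * pop_size)
        t += rng.expovariate(rate)  # exponential waiting time with this rate
    return t


def scaled_mean(n, m, pop_size, reps, seed=0):
    """Monte Carlo mean of n * (n - 1) * T_m / (2 * pop_size)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        t_m = simulate_coalescence_time(n, m, pop_size, rng)
        total += n * (n - 1) * t_m / (2.0 * pop_size)
    return total / reps


if __name__ == "__main__":
    n, m = 1000, 995
    # The gamma limit suggests a mean near n - m = 5 for m close to n.
    print(scaled_mean(n, m, pop_size=10000.0, reps=20000))
```

The exact mean under these assumptions is (n-1)(n-m)/m, which approaches n - m as m approaches n, so the simulated value should sit slightly above 5 for n = 1000 and m = 995.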
Journal: Genetics
Volume 194, Issue 3
Publication year: 2013